IEEE Access

Institute of Electrical and Electronics Engineers (IEEE)

Preprints posted in the last 30 days, ranked by how well they match IEEE Access's content profile, based on 31 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
Exploratory Assessment of Pulsed-Wave Doppler Representations of Lung Sounds Using Deep Learning: An In-Vitro Phantom Study

Saad, A. A.; Murthi, S. B.; Boctor, E. M.; Teeter, W. A.; Seam, N.

2026-06-10 respiratory medicine 10.64898/2026.06.09.26353787 medRxiv
Top 0.1%
10.6%

The increasing availability of portable ultrasound systems motivates exploration of novel approaches to respiratory signal assessment. In this in-vitro study, we investigate whether pulsed-wave (PW) Doppler ultrasound can capture structured spectral patterns from replayed lung sound recordings. Digitized respiratory sounds were replayed through a tissue-mimicking ultrasound phantom, generating 1,478 PW Doppler spectral images from recordings associated with healthy subjects and several externally labeled disease categories. Exploratory classification experiments using a ResNet-18 architecture demonstrated that these Doppler representations contain learnable differences under controlled conditions. These findings motivate further investigation into PW Doppler as a potential representation of respiratory acoustics.

2
Smartphone Placement Recognition during Walking: Performance Determinants and Real-World Generalizability

Tasca, P.; Trentadue, G.; Buckley, E.; Sun, S.; Long, M.; Ireson, N.; Ciravegna, F.; Lanfranchi, V.; Cereatti, A.

2026-05-14 bioengineering 10.64898/2026.05.12.724503 medRxiv
Top 0.1%
4.4%

The opportunity to collect movement data from smartphones for prolonged periods has opened new perspectives in the field of clinical movement analysis. However, when monitoring people's mobility in free-living conditions, smartphone placement can influence the validity of the extracted digital mobility outcome. This study aimed to develop and validate an automatic smartphone placement recognition classifier and to investigate potential critical factors that can influence performance. The classifier was trained on data from 15 healthy participants using inertial signals collected from smartphones placed at six body placements during free-living walking and externally validated on over 3,000 individuals from external datasets, including blind participants and patients with cardiovascular or Parkinson's disease. A decision-tree ensemble model was developed using feature subsets of increasing dimensionality, with the optimal subset comprising 50 features. Classification accuracy increased consistently when front and back pocket placements were aggregated (81.1%) and further improved when coat pocket was also included in the pocket class (88.5%), underscoring the challenge of distinguishing between fine-grained pocket placements. The best-recognized placements across the external datasets were lower back (precision: 100%, recall: 72.5%), hand (precision: 94.2%, recall: 94.5%), and the aggregated pocket class (precision: 86.7%, recall: 90.2%). Recognition accuracy varied across cohorts (0.73 - 0.85), activities (0.63 - 0.94) and speed (0.79 - 0.87); however, it stayed consistent across various technological and environmental factors. Overall, this study demonstrates the feasibility of robust placement recognition in walking and underscores the importance of accounting for key influencing factors when designing frameworks intended for deployment in heterogeneous real-world or clinical contexts.
Highlights:
- Machine learning accurately identifies smartphone placement during real-world gait
- Six on-body placements recognized, including pockets, hand, bag, and lower-back
- Free-living data used for training, ensuring robust performance across conditions
- Feature selection and hyperparameter tuning optimize classification accuracy
- External validation confirms generalizability across >3,000 healthy and diseased adults

3
Precision Physical Activity Prescription via Reinforcement Learning for Functional Actions

Lin, G.; Miao, R.; Sacheck, J.; Zhang, X.

2026-05-21 public and global health 10.64898/2026.05.18.26353525 medRxiv
Top 0.1%
4.4%

Physical activity (PA) plays an important role in maintaining and improving health. Daily steps have been a key PA measure that is easily accessible with common wearable devices. However, methods are lacking to recommend a personalized optimal distribution of daily steps over a period of time to optimize certain health biomarkers. In this paper, we fill this gap using data from the All of Us Research Program, which includes months of step counts as well as repeated measurements of key health biomarkers. We develop a new offline reinforcement learning (RL) algorithm to learn personalized and optimal PA distributions associated with cardiometabolic risk, where the action is a function representing the daily step distribution over a period of time. Simulation studies demonstrate the advantage of the proposed approach over existing continuous-action RL methods. The learned optimal policy from the All of Us data generally suggests people take more daily steps and also follow a more consistent pattern of PA over time while offering tailored recommendations for subgroups in blood glucose level, body mass index, blood pressure, age, and sex.

4
A New Hybrid Method for Brain Tumor Detection Based on Deep Learning

Sharbaf, S.

2026-05-28 bioinformatics 10.64898/2026.05.25.727707 medRxiv
Top 0.1%
3.7%

Brain tumor detection using Magnetic Resonance Imaging (MRI) remains a challenging task due to tumor heterogeneity and imaging variability. This paper presents a novel hybrid Deep Convolutional Neural Network-Whale Optimization Algorithm (DCNN-WOA) framework for automated brain tumor detection and classification. The proposed method consists of four main stages: MRI data preprocessing and augmentation, deep feature extraction using multi-layer Convolutional Neural Networks (CNN), feature selection and hyperparameter optimization via the Whale Optimization Algorithm (WOA), and final classification with comprehensive performance evaluation. By jointly optimizing deep features and training parameters, the framework effectively reduces feature redundancy, accelerates convergence, and enhances model generalization. Experimental results on a publicly available MRI dataset demonstrate that the DCNN-WOA model outperforms conventional CNN and state-of-the-art Deep Learning (DL) architectures, achieving an accuracy of 97.8%, sensitivity of 96.4%, specificity of 98.1%, and F1-score of 97.2%. The practical impact of this approach makes it a promising solution for real-time clinical decision-support systems in neuroimaging.

5
CRADLE: A Clinically Robust, Anatomy-Aware Post-Processing Framework for Infant GMA Landmark Tracking in 2D Videos

Kaur, M.; Abbasi, H.; McMorland, A. J.

2026-05-19 bioengineering 10.64898/2026.05.16.725614 medRxiv
Top 0.1%
3.5%

Accurate pose estimation is central to automated infant General Movements Assessment during the fidgety period, when subtle limb movements, particularly at distal joints, inform neurodevelopmental risks. Robust 2D pose tracking from handheld videos remains challenging in real-world settings, where occlusion, rapid motions, and visually ambiguous smaller joints frequently compromise anatomical accuracy. We present CRADLE, a clinically motivated, anatomy-aware post-processing pipeline designed to refine infant 2D movement trajectories across 24 anatomical landmarks detected by our DeepLabCut-trained model. CRADLE integrates segment-length constraints, velocity-based anomaly detection, anatomically constrained interpolation, and Kalman filtering to correct both large localization failures and subtle persistent joint misplacements without relying primarily on confidence scores. Evaluations against conventional Confidence-Thresholding using Mean Absolute Error (MAE), ΔMAE, average Percentage of Correct Keypoints, and net keypoint correction rate showed consistently reduced or preserved error while maintaining accurate trajectories, with the strongest gains achieved at clinically important distal joints. Mean improvements reached up to 5 pixels for some smaller distal landmarks, large-magnitude corrections occurred more often than with Confidence-Thresholding, and well-localised joints remained largely unaffected. Positive net correction rates across metacarpophalangeal and metatarsophalangeal distal landmarks further confirmed a favourable correction-degradation balance. By improving pose trajectory quality, CRADLE enhances the reliability of downstream movement analysis.

6
Anatomy-Guided 3D Graph Networks for Couinaud Segmentation in Tumor Affected Livers

You, L.; Dang, H.; Wang, H.; Matta, E.; Zhou, X.

2026-05-14 bioinformatics 10.64898/2026.05.11.724316 medRxiv
Top 0.2%
2.4%

Image-based liver Couinaud segmentation is designed to automatically provide the locations of suspicious objects in liver CT/MR images. Once achieved, the physicians will be guided to the target slice and area where the suspicious node is located. However, conventional algorithms trained primarily on healthy liver images often fail to generalize to Hepatocellular Carcinoma (HCC) cases due to pathological structural distortions. In this work, we propose a robust two-stage framework that integrates a 3D Unet with a 3D Anatomical Structure-Guided Graph Convolutional Network (3D GCN). This two-stage strategy effectively isolates the liver volume to eliminate structural noise from neighboring organs, such as the spleen, allowing the framework to focus exclusively on the complex 3D anatomical relationships among the eight segments. To ensure the topological consistency required for global spatial reasoning, we implement a standardized preprocessing pipeline that normalizes liver-only volumes to exactly 50 frames along the z-axis. By combining a lightweight 3D UNet backbone with the 3D GCN for refined boundary reasoning, our model demonstrates superior generalization performance on unseen clinical datasets, achieving a mean Dice score of 0.828 in blind testing. By releasing our code and pretrained weights, we aim to provide the first publicly available deep learning resource for robust Couinaud segmentation.
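
The abstract states that liver-only volumes are normalized to exactly 50 frames along the z-axis but does not say how. As a minimal sketch, assuming simple nearest-neighbour slice selection (the function name `resample_slice_indices` and the selection rule are illustrative assumptions, not the authors' pipeline), the source slices to keep could be chosen like this:

```python
def resample_slice_indices(num_slices: int, target: int = 50) -> list[int]:
    """Return `target` source-slice indices spread evenly over a volume.

    Assumption: nearest-neighbour selection along z. Volumes with fewer
    than `target` slices repeat some slices; larger volumes skip some.
    """
    if num_slices < 1 or target < 1:
        raise ValueError("num_slices and target must be positive")
    if target == 1:
        return [0]
    # Map output positions 0..target-1 onto input positions 0..num_slices-1.
    step = (num_slices - 1) / (target - 1)
    return [round(i * step) for i in range(target)]


if __name__ == "__main__":
    indices = resample_slice_indices(120)
    print(len(indices), indices[0], indices[-1])
```

A real pipeline would more likely interpolate intensities (and segment labels) rather than pick whole slices; the sketch only shows how a fixed output depth can be obtained from a variable input depth.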

7
Dual-Stream Compression of High Bit-Depth Medical Images with Application to DNA Storage

Su, H.; Fan, W.; Peng, J.; Zhang, Y.

2026-05-20 bioinformatics 10.64898/2026.05.17.724501 medRxiv
Top 0.2%
2.1%

High bit-depth medical images preserve subtle intensity variations that are important for quantitative analysis and clinical interpretation, but their large dynamic range poses challenges for efficient compression. We propose a bit-plane-aware dual-stream compression framework for 16-bit medical images by separately modeling the most significant bit (MSB) and least significant bit (LSB) components. The MSB structural stream is encoded using JPEG coding with a Duplicate Segment Skipping (DSS) strategy to exploit spatial and segment-level redundancy, while the LSB detail stream is compressed using learned image compression to represent residual variations and fine-grained details. Experiments on four MRI and CT datasets show that the proposed method consistently outperforms representative traditional and learning-based codecs, achieving the lowest bit rate across all datasets. Meanwhile, it preserves high reconstruction fidelity. As a downstream application, we further demonstrate that the compressed bitstreams can be effectively integrated with DNA encoding and converted into sequences with favorable biochemical properties.
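
The MSB/LSB separation described above can be illustrated with plain bit operations. This is a minimal sketch, assuming an even 8/8 split of each 16-bit pixel (the abstract does not state where the split falls); the function names are hypothetical, and the JPEG, DSS and learned-compression stages are not modelled:

```python
def split_bit_planes(pixels: list[int], lsb_bits: int = 8) -> tuple[list[int], list[int]]:
    """Split 16-bit pixel values into an MSB stream and an LSB stream.

    Assumption: the low `lsb_bits` bits form the detail (LSB) stream and
    the remaining high bits form the structural (MSB) stream.
    """
    mask = (1 << lsb_bits) - 1
    msb = [p >> lsb_bits for p in pixels]  # coarse structure
    lsb = [p & mask for p in pixels]       # fine-grained residual detail
    return msb, lsb


def merge_bit_planes(msb: list[int], lsb: list[int], lsb_bits: int = 8) -> list[int]:
    """Recombine the two streams into the original 16-bit values."""
    return [(m << lsb_bits) | l for m, l in zip(msb, lsb)]


if __name__ == "__main__":
    row = [0, 255, 256, 0x1234, 65535]
    high, low = split_bit_planes(row)
    print(high, low, merge_bit_planes(high, low) == row)
```

The split itself is lossless; any reconstruction error in such a scheme would come from how each stream is subsequently encoded.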

8
Performance of Vision-Language Models for Zero-Shot Lung Nodule Detection on Chest Radiographs

Nishio, M.; Matsuo, H.; Matsunaga, T.; Fujimoto, K.; Deperrois, N.; Nooralahzadeh, F.; Frauenfelder, T.; Krauthammer, M.; Murakami, T.

2026-06-03 radiology and imaging 10.64898/2026.05.31.26354565 medRxiv
Top 0.3%
2.0%

Background and Objectives: The ability of vision-language models (VLMs) to detect lung nodules on chest radiographs remains uncertain. This retrospective study aimed to compare the zero-shot performances of six VLMs for lung nodule detection using data from the Japanese Society of Radiological Technology (JSRT) chest radiograph database. Methods: A total of 247 chest radiographs from the JSRT database (154 with nodules and 93 without) were preprocessed and evaluated using six VLMs: RadVLM, gpt-4o-mini, Qwen3-VL-8B-Instruct, MedGemma-4b-it, LLaVA-Rad, and CheXpert Plus Model. Each model was tested using a zero-shot setting. The text outputs were binarized into nodule-present or nodule-absent labels by consensus between two radiologists. Sensitivity, specificity, accuracy, precision, and F1 scores were calculated. Pairwise differences in sensitivity, specificity, and accuracy were assessed using McNemar's test with Holm correction. Results: The overall performance was limited across all models. RadVLM achieved the highest accuracy (44.5%, 110/247) with perfect specificity (100.0%, 93/93) and precision (100.0%); however, its sensitivity was low (11.0%, 17/154). LLaVA-Rad showed the highest sensitivity (27.3%, 42/154) and F1 score (37.7%), but lower specificity (71.0%, 66/93). MedGemma-4b-it achieved 100.0% specificity, with a sensitivity of only 5.2% (8/154). Grade-specific analysis showed that detection rates were highest for obvious nodules and remained limited for subtle nodules. Pairwise analyses revealed significant differences in sensitivity and specificity for the selected model pairs, particularly between RadVLM and LLaVA-Rad. Conclusion: Current VLMs show limited zero-shot generalizability for lung nodule detection in the JSRT database, with marked trade-offs between sensitivity and specificity. Their near-term value may lie more in radiologist-assisted workflows than in stand-alone detection.
Clinical Impact: Current VLMs should not be used as stand-alone tools for lung nodule detection on chest radiographs because of their limited sensitivity and substantial model-dependent trade-offs. However, their high-specificity outputs in some models and higher-sensitivity behavior in others suggest potential roles in radiologist-assisted workflows, such as report drafting and second-reader support.
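
The percentages reported for RadVLM and LLaVA-Rad follow from the stated counts. A minimal sketch, assuming the false-negative and false-positive counts are obtained by subtraction from the stated denominators (154 radiographs with nodules, 93 without); the function name `binary_metrics` is illustrative and not taken from the study:

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    predicted_pos = tp + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        # Precision is undefined when nothing is predicted positive.
        "precision": tp / predicted_pos if predicted_pos else float("nan"),
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else float("nan"),
    }


if __name__ == "__main__":
    # Counts derived from the abstract: 17/154 and 93/93 for RadVLM,
    # 42/154 and 66/93 for LLaVA-Rad.
    radvlm = binary_metrics(tp=17, fn=154 - 17, tn=93, fp=0)
    llava_rad = binary_metrics(tp=42, fn=154 - 42, tn=66, fp=93 - 66)
    for name, m in (("RadVLM", radvlm), ("LLaVA-Rad", llava_rad)):
        print(name, {k: round(v * 100, 1) for k, v in m.items()})
```

With 154 of 247 images containing a nodule, a model that rarely predicts "nodule present" can reach perfect specificity and precision while its accuracy stays below 50%, which is the pattern reported for RadVLM.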

9
Conversational Speech for Respiratory Triage in Primary Care: A Pilot Study

Ravi, V.; Noufi, C.

2026-06-11 respiratory medicine 10.64898/2026.06.09.26355284 medRxiv
Top 0.3%
1.9%

Background. Respiratory complaints account for a substantial share of adult ambulatory care visits, and triaging them accurately has direct consequences for antibiotic stewardship and pathogen-specific therapy. Prior work has investigated voice as a triage signal, but that literature is dominated by single-condition detection from scripted speech in crowdsourced or controlled clinical settings and has not been evaluated at primary care scale on conversational ambient audio. Methods. A dataset of 514,377 ambient-recorded primary care visits from 379,225 adult patients at a US clinic network was used, with per-visit clinically assigned ICD-10 diagnosis codes and de-identified demographic and geographic metadata. Patient audio was extracted from each doctor-patient conversation, and spectral, voice quality, and prosodic features were computed. Eleven binary classification tasks were defined, aligned with a respiratory triage cascade (e.g., acute respiratory versus acute non-respiratory illness, and lower versus upper respiratory tract infection). An acoustic model (feed-forward network) was trained independently for each task using patient-stratified five-fold cross-validation and evaluated on a held-out test set. Each task's model was also compared against six non-acoustic baselines using a single demographic, geographic, or temporal variable. The 11 trained classifiers were composed into a hierarchical cascade and illustrated as case studies on selected patients. Results. Test-set AUC across the 11 tasks ranged from 0.602 (95% CI: 0.588-0.614) to 0.745 (95% CI: 0.742-0.748), with a mean expected calibration error of 0.018. Six of eleven binaries outperformed all confounder baselines. Four binaries showed median within-stratum AUC of 0.62-0.70 when the confounder was held fixed, indicating acoustic discrimination beyond what the confounder alone explains. 
The exception was the pneumonia versus non-pneumonia lower respiratory tract infection binary, which failed against the patient-city confounder baseline, plausibly reflecting a clinic-level difference in ICD-10 coding. Conclusion. Conversational primary care audio carries acoustic signal that discriminates clinically meaningful respiratory contrasts. Absolute performance is moderate, but the conditions are stricter than prior work: conversational speech and differential-diagnosis contrasts among sick patients. This pilot study is a baseline for voice-based clinical AI moving beyond sick-versus-healthy detection toward differential-diagnosis panels and a proof-of-concept for hierarchical reasoning.

10
Wearable and Interview-based Assessment of Psychological Risk in Alzheimer's Caregivers: Machine Learning vs. Large Language Models

Xiao, J.; Zhao, Z.; King, Z. D.; Khalid, M.; Davies, S.; Zanna, K.; Argueta, D. L.; Brice, K. N.; Wu-Chung, E. L.; Lai, V. D.; Paoletti-Hatcher, J.; Denny, B. T.; Henry, S.; Schulz, P. E.; Fagundes, C. P.; Sano, A.

2026-05-27 psychiatry and clinical psychology 10.64898/2026.05.24.26353993 medRxiv
Top 0.3%
1.9%

Spousal caregivers of individuals with Alzheimer's disease and related dementias frequently experience elevated perceived stress, caregiver burden, and loneliness, which are associated with adverse health outcomes. Early identification is therefore critical for timely intervention. Existing approaches commonly rely on wearable sensor data and standardized psychological questionnaires, while recent multimodal methods aim to improve prediction by integrating behavioral and linguistic information. In this study, we explored three modality configurations (wearable-derived features, interview-based text, and their combination) to classify caregiver psychological risk using the Perceived Stress Scale (PSS), Zarit Burden Interview, and UCLA Loneliness Scale. We compared traditional machine learning models and large language models (LLMs) (Gemini 2.0, Llama 4, and GPT-4o) under psychometrician-centered and caregiver-centered prompting strategies. Traditional machine learning models performed better under multimodal settings, while LLMs achieved stronger performance with Interview-Only input. We further demonstrate that PSS was the most predictable construct and that prompting strategies substantially influenced LLM performance.

11
ReMind: A Retrospective Self-Report Paradigm for Studying Mind-Wandering Onset During Reading

Sun, H.; Birney, A.; Singh, N.; Olszko, A.; Chen, P.; Ke, J.; Rosenberg, M. D.; Jangraw, D. C.

2026-05-18 bioengineering 10.64898/2026.05.14.725227 medRxiv
Top 0.3%
1.8%

Mind-wandering (MW) is a frequent and pervasive phenomenon, yet it is commonly assessed using self-reports or probe-based methods that offer limited temporal precision regarding its onset. In this study, we introduce a novel paradigm, ReMind, that estimates the onset and duration of MW episodes during natural reading by combining retrospective self-reports with eye-tracking. Participants indicated the words where they believed their mind started and stopped wandering, and these reports were aligned with gaze timestamps to estimate MW onset. Using data from 44 participants, we examined whether knowledge of MW onset improves the detection of MW from eye-tracking signals. To evaluate relevance for both self-report and thought-probe paradigms, we additionally simulated thought probes by randomly sampling time points during reading. Logistic regression classifiers trained on eye-tracking features extracted from time windows anchored to MW onset achieved AUROC scores of 0.659 and 0.621 under the self-report and simulated thought-probe paradigms, respectively, using leave-one-subject-out cross-validation. In both cases, onset-aligned windows outperformed classifiers trained using arbitrary MW windows. Sliding-window analyses further revealed systematic temporal changes around MW onset, with classification performance peaking at approximately 3 seconds after onset. Feature-level analyses showed reduced fixation rate and fixation dispersion, along with increased pupil size following MW onset. Together, these findings characterize the temporal progression from on-task reading to MW. Overall, ReMind provides a useful framework for studying the temporal dynamics of MW during naturalistic reading.

12
A Hybrid Quantum-Classical Multiscale LSTM Framework for Subject-Level EEG-Based Depression Detection

E, S.; Wang, C.; Rao, T. D.; Kumar, T. S.

2026-05-20 psychiatry and clinical psychology 10.64898/2026.05.18.26353461 medRxiv
Top 0.3%
1.8%

Major depressive disorder (MDD) is a common psychiatric disorder that requires reliable and objective assessment for early clinical intervention. Electroencephalography (EEG) is widely used for this purpose because it provides a non-invasive and low-cost measure of brain activity with high temporal resolution. However, EEG-based depression detection remains challenging due to the nonlinear nature of EEG signals, inter-subject variability, and the limited availability of subject-independent evaluation. To address these issues, this paper proposes a hybrid quantum-classical multiscale long short-term memory with parameterized quantum circuit branches (MS-LSTM-PQC) framework for subject-level EEG-based depression detection. The proposed model extracts temporal representations at multiple scales using parallel LSTM branches and incorporates eyes-closed (EC) and eyes-open (EO) condition information through condition-aware feature fusion. To further enhance the learned representations, scale-specific LSTM features are processed using PQC-based quantum branches implemented with TensorFlow Quantum (TFQ), providing an additional nonlinear feature transformation before classification. Experiments were conducted on the Mumtaz EEG depression dataset using EC-only, EO-only, and merged EC+EO conditions with 1-s, 2-s, and 3-s EEG windows. To reduce subject-level data leakage, all experiments were evaluated using 5-fold and 10-fold GroupKFold validation. The best overall accuracies across the evaluated settings were 92.05% and 95.08% under 5-fold and 10-fold GroupKFold validation, respectively. The 2-s merged EC+EO setting provided the most stable performance across validation protocols. In addition, Integrated Gradients (IG)-based explainability analysis showed that frontal and fronto-central channels, especially Fz, showed higher contributions to the model decision. 
These results suggest that multiscale temporal learning with quantum-enhanced feature transformation can support subject-level EEG-based depression detection under leakage-controlled evaluation.
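
The leakage control mentioned above rests on keeping all EEG windows from one subject in the same fold. The following is a minimal pure-Python sketch of that idea, assuming a simple round-robin assignment of subjects to folds; it is not scikit-learn's `GroupKFold` (which also balances fold sizes), and the function name `group_k_fold` is hypothetical:

```python
from typing import Hashable, Iterator, Sequence


def group_k_fold(
    groups: Sequence[Hashable], n_splits: int = 5
) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train_indices, test_indices) with no subject in both sets.

    `groups[i]` is the subject ID of sample (EEG window) i.
    """
    unique = sorted(set(groups), key=str)
    if n_splits < 2 or n_splits > len(unique):
        raise ValueError("n_splits must be between 2 and the number of groups")
    # Round-robin: every window of a subject lands in exactly one fold.
    fold_of = {g: i % n_splits for i, g in enumerate(unique)}
    for fold in range(n_splits):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test


if __name__ == "__main__":
    subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s5", "s5", "s6"]
    for train_idx, test_idx in group_k_fold(subject_ids, n_splits=3):
        held_out = sorted({subject_ids[i] for i in test_idx})
        print(f"test subjects: {held_out}, train windows: {len(train_idx)}")
```

Splitting at the window level instead would let windows from the same recording appear in both training and test sets, which tends to inflate reported accuracy.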

13
Signal Quality Screening and Automated Sleep Stage Agreement in Home EEG: A Systematic Comparison of Dreamento and YASA on the Wearanize+ Dataset

Parry, Y. D.; Briganti, G.

2026-06-03 neurology 10.64898/2026.06.01.26354591 medRxiv
Top 0.4%
1.7%

Wearable EEG devices such as the Zmax headband offer scalable alternatives to laboratory polysomnography (PSG) for sleep monitoring, but their real-world performance in home settings remains poorly characterised. This study presents a systematic validation of automated sleep staging on the Wearanize+ dataset, a unique multimodal resource providing synchronised full PSG, bilateral Zmax EEG (F7-Fpz/F8-Fpz), and psychiatric phenotyping from 100 participants recorded at home. We first developed and applied an automated signal quality screening framework, revealing that 10% of recordings failed completely due to signal dropout and a further 16% showed partial degradation. We then evaluated two automated staging algorithms, Dreamento and YASA, against PSG manual scoring, stratified by signal quality. In technically adequate recordings (N=74), YASA achieved significantly higher agreement than Dreamento (mean κ=0.450 vs 0.371; Δκ=+0.079, p=0.0005), primarily through substantially improved N2 detection (recall: 0.64 vs 0.36). Both algorithms showed a systematic N2/N3 boundary confusion, but in opposite directions: Dreamento over-called N3 (37% of N2 epochs mis-staged as N3), while YASA over-called N2 (35% of N3 epochs mis-staged as N2). Critically, Dreamento showed greater robustness than YASA in degraded-quality recordings (WARN group: κ=0.414 vs 0.330), consistent with its training on Zmax-specific data. Signal quality metrics did not predict staging performance within adequate recordings, indicating that channel topology is the primary limiting factor for frontal single-channel staging. These findings establish the Wearanize+ dataset as a benchmark for wearable sleep staging and motivate the use of PSG manual stage labels for downstream physiological analyses.
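
Agreement in this abstract is reported as Cohen's kappa between automated and manual stage labels. A minimal sketch of the standard unweighted statistic for two epoch-by-epoch label sequences follows; the stage labels in the example are made up for illustration and are not data from the study:

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(rater_a)
    # Observed agreement: fraction of epochs with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    manual = ["W", "N2", "N2", "N3", "REM", "N2"]     # hypothetical PSG scoring
    automated = ["W", "N2", "N3", "N3", "REM", "N2"]  # hypothetical algorithm output
    print(round(cohen_kappa(manual, automated), 3))
```

Because kappa discounts agreement expected by chance, it is typically lower than raw epoch-wise accuracy when one stage, such as N2, dominates the night.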

14
From CCTA to Surgical Strategy: An Integrated AI Framework for Patient-Specific Coronary Artery Bypass Grafting Planning

Rezaeitaleshmahalleh, M.; Masoumi, S.; Debalme, E.; Sundt, T. M.; Aranki, S. F.; Shin, B.; Nezami, F. R.

2026-06-01 cardiovascular medicine 10.64898/2026.05.28.26354400 medRxiv
Top 0.4%
1.7%

Background: Coronary artery bypass grafting (CABG) remains the standard of care for complex multivessel and left main coronary artery disease. However, current preoperative planning remains largely subjective, relying on qualitative interpretation of coronary CT angiography (CCTA), operator-dependent stenosis grading, and fragmented multi-software workflows. Invasive fractional flow reserve (FFR), the reference standard for physiologic lesion assessment, is infrequently acquired preoperatively, leaving distal anastomosis planning without an objective hemodynamic basis. Methods: We developed a fully automated, AI-powered platform that converts routine CCTA into a patient-specific CABG planning workflow through five integrated modules: nnU-Net-based segmentation of coronary lumen and calcification; quantitative morphological and topological characterization generating more than thirty descriptors; automated stenosis detection using a local reference-radius formulation; a nine-point composite scoring framework for distal anastomosis site selection incorporating luminal caliber, landing-zone length, calcification burden, distal perfusion reserve, and bifurcation proximity; and interactive virtual graft construction coupled to a distributed reduced-order solver for pre- and post-bypass FFR estimation. Results: Lumen segmentation achieved a mean Dice similarity coefficient of 0.96 ± 0.01, whereas calcium segmentation achieved 0.73 ± 0.15 on the held-out cohort. Platform-derived FFR demonstrated strong agreement with invasively measured FFR (r=0.96, mean absolute relative difference 1.73 ± 1.42%) across the evaluated lesions, supporting the physiologic validity of the reduced-order hemodynamic solver.
End-to-end analysis from raw CCTA to hemodynamic assessment and virtual graft planning was completed in approximately seven minutes per case on a standard workstation, representing a substantial reduction in processing time compared with conventional multi-tool and CFD-based workflows. Conclusions: The proposed platform demonstrates the feasibility of rapid, reproducible, and physiology-informed CABG planning using routine CCTA. By integrating anatomical characterization, automated target-site analysis, virtual graft construction, and reduced-order hemodynamic assessment into a single workflow, the framework provides objective, quantitative surgical decision support compatible with routine clinical workflows. Keywords: Coronary artery bypass grafting (CABG); Fractional flow reserve (FFR); Coronary CT angiography (CCTA); Surgical planning

15
Facial Skin Blood Flow Enhances the Human Likeness of Artificial Agents

Nikaido, S.; Isomura, T.

2026-05-26 physiology 10.64898/2026.05.22.726810 medRxiv
Top 0.4%
1.6%

Recent studies have shown that implementing explicit social cues, such as gaze, facial expressions, and gestures, in artificial agents can improve impressions of these agents. However, humans may also use implicit physiological cues, such as facial coloration and cardiac information, in social perception. The present study examined whether subtle skin color changes reflecting pulse signals enhance the perceived human likeness of artificial agents, and whether this effect depends on agent type, signal type, observers' interoceptive sensibility, and their awareness of the skin color changes. Participants observed morphed face stimuli created from artificial agents and human faces and judged whether each stimulus appeared human-like or robot-like. In Experiment 1, skin color changes based on human-derived pulse wave signals enhanced perceived human likeness for a highly human-like agent, but not for a less human-like agent. In Experiment 2, perceived human likeness was enhanced not only by pulse-based skin color changes but also by sinusoidal skin color changes matched to the pulse wave signal in terms of mean amplitude and number of peaks. In addition, participants with higher scores on some subscales of the Multidimensional Assessment of Interoceptive Awareness (MAIA), a subjective measure of interoceptive sensibility, tended to notice the skin color changes. However, neither observers' interoceptive sensibility nor their awareness of skin color changes directly explained the enhancement of perceived human likeness induced by skin color changes. These results suggest that subtle skin color changes reflecting pulse wave information may function as implicit dynamic cues signaling embodiment or biologicalness in artificial agents, thereby contributing to perceived human likeness.

16
Evaluating Long-Range Temporal Structure in Foundation Model-Based Forecasts of Heartbeat Dynamics

Serapio, A.; Ramsundar, B.; Subramanian, S.

2026-05-28 bioinformatics 10.64898/2026.05.25.727760 medRxiv
Top 0.4%
1.6%

We examine the long-range temporal structure of forecasts produced by Time-Series Foundation Models (TSFMs) on heartbeat dynamics using the MIT-BIH Normal Sinus Rhythm Database (NSRDB). Our findings indicate that these models do not adequately capture long-range dependencies, as reflected in growing errors in RR-interval predictions over longer forecast horizons. Code is available at https://github.com/SubramanianLab/ecg-tsfm-benchmark.

17
Next-Generation Skin Cancer Detection Using Efficient Fuzzy Fusion of Genomic and Imaging Data

Molla, A. R.; Maity, A.; Saha, S.; Bhattacharya, R.; Chakraborty, A.; Biswas, S.; Nath, S.

2026-06-08 health informatics 10.64898/2026.06.05.26355024 medRxiv
Top 0.4%
1.5%

Skin cancer requires early detection for improved survival rates. Most existing methods rely on deep learning based image classification, which is affected by visual similarity among lesions. Fewer studies use Gene Expression (GE) analysis, which captures molecular characteristics but lacks structural and visual details. To overcome limitations of individual modalities, this paper proposes a multimodal framework integrating dermoscopic images and GE profiles for skin cancer classification. EfficientNet and logistic regression are used for image based analysis and genomic skin lesion profiling, respectively, followed by fuzzy rule based decision systems to reduce uncertainty within individual modalities. Finally, fuzzy fusion combines predictions from both modalities using uncertainty based weighting of classifier outputs. The experimental findings show that both the image based and GE based classification models individually achieved accuracies of nearly 92%. However, the integration of prediction results through the proposed fuzzy fusion strategy further enhanced the classification performance, achieving an overall accuracy of 94.25%. The results obtained outperform contemporary methods, highlighting the effectiveness of combining complementary multimodal information compared with single modality approaches.

18
Oxygen-based endotypes of Obstructive Sleep Apnea

Wellman, A.; Messineo, L.; Azarbarzin, A.; Esmaeili, N.; Aishah, A.; Vena, D.; Sumner, J.; White, D.; Sands, S.

2026-06-04 respiratory medicine 10.64898/2026.06.03.26354835 medRxiv
Top 0.4%
1.5%

Objective: Several endotypes contribute to the development of Obstructive Sleep Apnea (OSA). However, efforts to measure these endotypes have been challenging. In this paper, we propose a new method that overcomes some of these challenges. Methods: To test the feasibility of this new method, data from the Sleep Heart Health Study (SHHS) were analyzed and two oxygen-based endotypes were identified and plotted on a graphical model: the steady-state SpO2 and the SpO2 arousal threshold. The first is the oxygen saturation that would occur during sleep if there were no arousals, and it is a measure of upper airway collapsibility (a more collapsible airway produces a lower SpO2). The second is the oxygen saturation that triggers arousals. These endotypes were validated by assessing their ability to detect positional and state-related changes in airway collapsibility and arousal threshold. Results: The study showed that it was feasible to measure oxygen-based endotypes in 95% of SHHS participants. As expected, steady-state SpO2 was lower during supine vs. non-supine sleep, as well as during REM vs. NREM sleep. Also as expected, the SpO2 arousal threshold was similar between supine and non-supine sleep. However, the SpO2 arousal threshold was not lower in REM sleep vs. NREM sleep. Therefore, in 3 of the 4 conditions, the oxygen-based endotypes moved in the expected direction due to positional or sleep state changes. Conclusion: Although further validation experiments are required, this study indicates that OSA endotyping using the pulse oximetry signal is feasible. The oxygen-based endotypes could be used to aid therapeutic decision making.

19
DINMC: A Deep Learning Framework for Interpretable Normative Model Construction and Pathological Brain Alteration Detection

Ge, Z.; Liu, S.; Dou, W.

2026-05-29 bioinformatics 10.64898/2026.05.29.728652 medRxiv
Top 0.4%
1.5%

Background and Objective: Normative modeling is a key tool for understanding brain alterations in neurodegenerative diseases, such as cerebellar-type multiple system atrophy. However, existing methods lack interpretability and fail to capture clinically meaningful pathological changes. This study presents DINMC, a Deep Interpretable Normative Model Construction framework, which combines autoencoder-based learning with statistical hypothesis testing to better capture and interpret disease-specific neuroanatomical changes. Methods: The DINMC framework constructs normative models using neuroimaging data from large multi-site healthy cohorts. It utilizes a U-shaped convolutional autoencoder to train these models, which are then applied to reconstruct brain features from both patients and healthy controls within the same study cohort. Pathological confidence values are derived by fusing original and deviation feature spaces, offering a measure of disease-related pathology reflected in each dimension of the features. The framework was validated through statistical analysis and prognostic classification and regression tasks. Results: The pathological confidence provides valuable insights into the neuroanatomical regions most affected by the disease, as well as the correlation between changes in these regions and clinical assessment scales. Our optimal model outperforms traditional methods in prognostic prediction tasks, with an AUC of 0.972 for classification tasks and an R² of 0.432 for regression tasks. Conclusion: DINMC provides a novel and interpretable framework for neuroimaging analysis. By combining deep learning and statistical hypothesis testing, this framework offers a unique solution to improving both the interpretability and performance of normative models in neuroimaging. The approach is scalable to other neuroimaging datasets, offering a versatile tool for broader biomedical applications.

20
A Supervised Learning Framework for Stroke Hospitalization Factors Selection Using the Lasso-MIDAS Model

Li, Q.; Wang, L.

2026-05-20 cardiovascular medicine 10.64898/2026.05.15.26353365 medRxiv
Top 0.5%
1.4%

Stroke, as an acute cerebrovascular disease with significant public health implications, is influenced by a complex interplay of meteorological conditions, air quality, and socioeconomic factors. However, the inherent challenges of mixed-frequency data from diverse sources and high-dimensional variable spaces limit the effectiveness of traditional regression models. This study develops a Lasso-MIDAS model framework to identify the key multidimensional drivers of stroke admissions. Using this approach, 21 candidate variables encompassing meteorological, environmental, and economic indicators were screened. The empirical results identified 11 core influencing factors. In the meteorological and environmental dimensions, Wind Speed, Carbon Monoxide (CO), and Sulfur Dioxide (SO2) were identified as significant positive drivers, with Temperature Difference also positively correlating with admission risks. Conversely, Nitrogen Dioxide (NO2) exhibited a negative correlation, potentially reflecting behavioral adaptation and exposure reduction during peak pollution periods. In the socioeconomic dimension, the Consumer Price Index (CPI) for Food, Tobacco, and Alcohol emerged as a major risk factor, highlighting the impact of living cost pressures on public health. The findings demonstrate the superiority of the Lasso-MIDAS model in handling large-scale healthcare data. It effectively addresses the frequency mismatch problem while enhancing the robustness of causal identification through variable shrinkage. These conclusions provide a scientific basis for health authorities to establish early warning systems and optimize public health policy interventions.
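
The abstract gives no implementation details for the Lasso-MIDAS model. The sketch below illustrates the general idea under stated assumptions: high-frequency observations are stacked into one lag column per low-frequency period (an unrestricted, U-MIDAS-style alignment, chosen here for simplicity), and a Lasso fitted by coordinate descent shrinks uninformative coefficients to exactly zero. The helper names, the absence of an intercept (data assumed centered), and the toy numbers are all hypothetical and are not taken from the paper.

```python
def stack_lags(high_freq, m):
    """Align a high-frequency series with a low-frequency target.

    Each row holds the m observations of one low-frequency period,
    ordered from most recent to oldest.
    """
    n_periods = len(high_freq) // m
    return [list(reversed(high_freq[t * m:(t + 1) * m])) for t in range(n_periods)]


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b, since b starts at zero
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes its own effect
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, lam) / z
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                beta[j] = new_beta
    return beta


# Toy data: six high-frequency values, three per low-frequency period.
lagged = stack_lags([1, 2, 3, 4, 5, 6], 3)

# Toy orthogonal design: the target depends mostly on the first column,
# so the penalty removes the second coefficient entirely.
coef = lasso_cd([[1, 0], [0, 1], [-1, 0], [0, -1]], [2, 0.1, -2, -0.1], lam=0.1)
```

In a setting like the one described, each candidate indicator would contribute its own block of lag columns, and the variables whose coefficients survive the penalty would be the selected factors.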